Extracting relations from traditional Chinese medicine literature via heterogeneous entity networks

Authors

  • Huaiyu Wan
  • Marie-Francine Moens
  • Walter Luyten
  • Xuezhong Zhou
  • Qiaozhu Mei
  • Lu Liu
  • Jie Tang
Abstract

OBJECTIVE Traditional Chinese medicine (TCM) is a unique and complex medical system that has developed over thousands of years. This article studies the problem of automatically extracting meaningful relations between entities from TCM literature, for the purposes of assisting clinical treatment or poly-pharmacology research and promoting the understanding of TCM in Western countries. METHODS Instead of separately extracting each relation from a single sentence or document, we propose to collectively and globally extract multiple types of relations (e.g., herb-syndrome, herb-disease, formula-syndrome, formula-disease, and syndrome-disease relations) from the entire corpus of TCM literature, from the perspective of network mining. In our analysis, we first constructed heterogeneous entity networks from the TCM literature, in which each edge is a candidate relation, and then used a heterogeneous factor graph model (HFGM) to simultaneously infer the existence of all the edges. We also employed a semi-supervised learning algorithm to estimate the model's parameters. RESULTS We applied our method to extract relations from a large dataset consisting of more than 100,000 TCM article abstracts. Our results show that the HFGM performed significantly better than a traditional support vector machine (SVM) classifier at extracting all types of relations from TCM literature (increasing the average precision by 11.09%, the recall by 13.83%, and the F1-measure by 12.47% for the different types of relations). CONCLUSION This study exploits the power of collective inference and proposes an HFGM based on heterogeneous entity networks, which significantly improved our ability to extract relations from TCM literature.
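The network-construction step described in METHODS can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: entities are found by simple dictionary lookup, and two entities form a candidate edge when they co-occur in the same abstract. The lexicon, the sample abstracts, and the function name `candidate_edges` are hypothetical. The sketch stops at producing candidate edges and does not implement the HFGM inference or the semi-supervised parameter estimation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy lexicon: entity name -> entity type (not from the paper).
LEXICON = {
    "ginseng": "herb",
    "astragalus": "herb",
    "qi deficiency": "syndrome",
    "diabetes": "disease",
    "yu-ping-feng-san": "formula",
}

# The five relation types listed in the abstract, as unordered type pairs.
RELATION_TYPES = {
    frozenset(("herb", "syndrome")),
    frozenset(("herb", "disease")),
    frozenset(("formula", "syndrome")),
    frozenset(("formula", "disease")),
    frozenset(("syndrome", "disease")),
}


def candidate_edges(abstracts):
    """Count co-occurrences of entity pairs whose types form an allowed relation.

    Assumption: co-occurrence within one abstract defines a candidate edge.
    """
    edges = Counter()
    for text in abstracts:
        lowered = text.lower()
        # Naive substring matching stands in for real entity recognition.
        found = sorted(name for name in LEXICON if name in lowered)
        for a, b in combinations(found, 2):
            if frozenset((LEXICON[a], LEXICON[b])) in RELATION_TYPES:
                edges[(a, b)] += 1
    return edges


abstracts = [
    "Ginseng is often prescribed for qi deficiency in patients with diabetes.",
    "Astragalus and ginseng were studied in qi deficiency.",
]
edges = candidate_edges(abstracts)
for (a, b), count in sorted(edges.items()):
    print(f"{a} -- {b} ({LEXICON[a]}-{LEXICON[b]}): {count}")
```

Each key in `edges` is one candidate relation. A collective model such as the HFGM would then decide, jointly over all edges, which candidates are true relations.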


Similar resources

TCMGeneDIT: a database for associated traditional Chinese medicine, gene and disease information using text mining

BACKGROUND Traditional Chinese Medicine (TCM), a complementary and alternative medical system in Western countries, has been used to treat various diseases over thousands of years in East Asian countries. In recent years, many herbal medicines were found to exhibit a variety of effects through regulating a wide range of gene expressions or protein activities. As available TCM data continue to a...


Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network

Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...


sGRAPH: A Domain Ontology-driven Semantic Graph Auto Extraction System

This paper presents sGRAPH – a domain ontology-driven semantic graph auto extraction system used to discover knowledge from text publications in traditional Chinese medicine. The traditional Chinese medicine language system (TCMLs), composed of an ontology schema and a knowledge base containing 153,692 words and 304,114 relations, is used as the domain ontology. The sGRAPH comprises two compone...


Anti-inflammatory effect of Yu-Ping-Feng-San via TGF-β1 signaling suppression in rat model of COPD

Objective(s): Yu-Ping-Feng-San (YPFS) is a classical traditional Chinese medicine that is widely used for treatment of the diseases in respiratory systems, including chronic obstructive pulmonary disease (COPD) recognized as chronic inflammatory disease. However, the molecular mechanism remains unclear. Here we detected the factors involved in transforming growth factor beta 1 (TGF-β1)/Smad2 si...


A Trainable Method For Extracting Chinese Entity Names And Their Relations

In this paper we propose a trainable method for extracting Chinese entity names and their relations. We view the entire problem as series of classification problems and employ memory-based learning (MBL) to resolve them. Preliminary results show that this method is efficient, flexible and promising to achieve better performance than other existing methods.



Journal:
  • Journal of the American Medical Informatics Association: JAMIA

Volume 23, Issue 2

Pages: -

Publication date: 2016